Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] implement redis cache mode #1222

Merged
merged 17 commits into from
Jan 20, 2024
Merged

Conversation

vijaykramesh
Copy link
Contributor

@vijaykramesh vijaykramesh commented Jan 12, 2024

Why are these changes needed?

This adds a redis mode to the cache. This way I can have multiple processes in separate containers running the same application (that is using autogen) and they can share LLM cache (vs in the current disk cache implementation the SQLIte instance ends up being machine local and can't be easily shared across multiple containers/pods).

The actual redis caching is using pickling, same as the disk cache implementation uses. So the cache should be functionally equivalent to the disk cache version.

Docs added inline and then also agent_chat.md:

LLM Caching

Legacy Disk Cache

By default, you can specify a cache_seed in your llm_config in order to take advantage of a local DiskCache backed cache. This cache will be used to store the results of your LLM calls, and will be used to return results for the same input without making a call to the LLM. This is useful for saving on compute costs, and for speeding up inference.

assistant = AssistantAgent(
    "coding_agent",
    llm_config={
        "cache_seed": 42,
        "config_list": OAI_CONFIG_LIST,
        "max_tokens": 1024,
    },
)

Setting this cache_seed param to None will disable the cache.

Configurable Context Manager

A new configurable context manager allows you to easily turn on and off LLM cache, using either DiskCache or Redis. All LLM agents inside the context manager will use the same cache.

from autogen.cache.cache import Cache

with Cache.redis(cache_seed=42, redis_url="redis://localhost:6379/0") as cache_client:
    user.initiate_chat(assistant, message=coding_task, cache_client=cache_client)

with Cache.disk(cache_seed=42, cache_dir=".cache") as cache_client:
    user.initiate_chat(assistant, message=coding_task, cache_client=cache_client)

Here's an example of the new integration test running in CI (note I had to setup my fork to get it to run, I think it will only run when it is on main that is being merged into? - and in my fork the other tests fail due to my OAI_CONFIG_LIST not being correct.

Screenshot 2024-01-12 at 3 47 39 PM

Integration test coverage for the new code I added:

➜  autogen git:(vr/redis_cache) ✗ coverage run -a -m pytest test/agentchat/test_cache.py
=============================================================================================================== test session starts ===============================================================================================================
platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.3.0
rootdir: /Users/vijay/oss/autogen
configfile: pyproject.toml
plugins: Faker-19.3.0, anyio-3.7.1
collected 3 items


test/agentchat/test_cache.py ...                                                                                                                                                                                                            [100%]

=============================================================================================================== 3 passed in 47.20s ================================================================================================================

Screenshot 2024-01-14 at 1 47 43 PM

And then per PR feedback I added some unit tests for the cache implementations.

➜  autogen git:(vr/redis_cache) ✗ coverage run -a -m pytest test/cache
=============================================================================================================== test session starts ===============================================================================================================
platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.3.0
rootdir: /Users/vijay/oss/autogen
configfile: pyproject.toml
plugins: Faker-19.3.0, anyio-3.7.1
collected 14 items


test/cache/test_cache.py ....                                                                                                                                                                                                               [ 28%]
test/cache/test_disk_cache.py .....                                                                                                                                                                                                         [ 64%]
test/cache/test_redis_cache.py .....                                                                                                                                                                                                        [100%]

=============================================================================================================== 14 passed in 1.24s ================================================================================================================
Screenshot 2024-01-14 at 1 48 30 PM

Related issue number

Checks

@vijaykramesh
Copy link
Contributor Author

@microsoft-github-policy-service agree company="Regrello"

@davorrunje
Copy link
Contributor

Redis cache and disk cache have different behaviors after being closed:
...
My thinking is that the cache should detaches itself from every agent instances once it exits the with context.

I agree. This will certainly cause problems in future if not unified.

@ekzhu
Copy link
Collaborator

ekzhu commented Jan 20, 2024

@sonichi @vijaykramesh @davorrunje

Sorry I was incorrect to say that Redis and DiskCache will have different exit behaviors. I tested myself and there is no difference. Both caches will stay alive and re-opens once you use it again.

@ekzhu
Copy link
Collaborator

ekzhu commented Jan 20, 2024

I pushed another commit to make sure a_run_chat has the same handling for cache as run_chat.

@sonichi sonichi added this pull request to the merge queue Jan 20, 2024
Merged via the queue into microsoft:main with commit ee6ad8d Jan 20, 2024
97 checks passed
corleroux pushed a commit to corleroux/autogen that referenced this pull request Jan 30, 2024
* implement redis cache mode, if redis_url is set in the llm_config then
it will try to use this.  also adds a test to validate both the existing
and the redis cache behavior.

* PR feedback, add unit tests

* more PR feedback, move the new style cache to a context manager

* Update agent_chat.md

* more PR feedback, remove tests from contrib and have them run with the normal jobs

* doc

* updated

* Update website/docs/Use-Cases/agent_chat.md

Co-authored-by: Chi Wang <[email protected]>

* update docs

* update docs; let openaiwrapper to use cache object

* typo

* Update website/docs/Use-Cases/enhanced_inference.md

Co-authored-by: Chi Wang <[email protected]>

* save previous client cache and reset it after send/a_send

* a_run_chat

---------

Co-authored-by: Vijay Ramesh <[email protected]>
Co-authored-by: Eric Zhu <[email protected]>
Co-authored-by: Chi Wang <[email protected]>
whiskyboy pushed a commit to whiskyboy/autogen that referenced this pull request Apr 17, 2024
* implement redis cache mode, if redis_url is set in the llm_config then
it will try to use this.  also adds a test to validate both the existing
and the redis cache behavior.

* PR feedback, add unit tests

* more PR feedback, move the new style cache to a context manager

* Update agent_chat.md

* more PR feedback, remove tests from contrib and have them run with the normal jobs

* doc

* updated

* Update website/docs/Use-Cases/agent_chat.md

Co-authored-by: Chi Wang <[email protected]>

* update docs

* update docs; let openaiwrapper to use cache object

* typo

* Update website/docs/Use-Cases/enhanced_inference.md

Co-authored-by: Chi Wang <[email protected]>

* save previous client cache and reset it after send/a_send

* a_run_chat

---------

Co-authored-by: Vijay Ramesh <[email protected]>
Co-authored-by: Eric Zhu <[email protected]>
Co-authored-by: Chi Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants